Metadata

%0 Conference Proceedings
%4 sid.inpe.br/sibgrapi/2015/08.01.16.12
%2 sid.inpe.br/sibgrapi/2015/08.01.16.12.58
%T Manifold learning using Isomap applied to spatial audio personalization
%D 2015
%A Grijalva, Felipe,
%A Goldenstein, Siome,
%A Florencio, Dinei,
%A Martini, Luiz,
%@affiliation School of Electrical and Computer Engineering, University of Campinas, Campinas, Brazil.
%@affiliation Institute of Computing, University of Campinas, Campinas, Brazil.
%@affiliation Multimedia, Interaction and Communication Group, Microsoft Research, Redmond, WA, USA.
%@affiliation School of Electrical and Computer Engineering, University of Campinas, Campinas, Brazil.
%E Segundo, Maurício Pamplona,
%E Faria, Fabio Augusto,
%B Conference on Graphics, Patterns and Images, 28 (SIBGRAPI)
%C Salvador, BA, Brazil
%8 26-29 Aug. 2015
%I Sociedade Brasileira de Computação
%J Porto Alegre
%S Proceedings
%K Isomap, Manifold Learning, Spatial Audio.
%X As augmented reality applications become more important, there is increasing effort in spatial audio research. The term spatial audio or 3D sound refers to techniques where a person's anatomy is modeled as digital filters. By filtering a sound source with these filters, a listener is capable of perceiving a sound as though it were reproduced at a specific spatial location. In the frequency domain, these filters are known as Head-Related Transfer Functions (HRTFs). A significant problem for the implementation of 3D sound systems is the fact that spectral features of HRTFs differ widely among individuals due to their anatomical differences. Thus, it is necessary to personalize them to guarantee high-quality sound perception. With this aim, we introduce a new anthropometric-based method for customizing HRTFs in the horizontal plane using manifold learning. The method uses Isomap, artificial neural networks (ANN), and a neighborhood-based reconstruction procedure. We first modify Isomap's graph construction step to emphasize the individuality of HRTFs and perform a customized nonlinear dimensionality reduction of the HRTFs. We then use an ANN to model the nonlinear relationship between anthropometric features and our low-dimensional HRTFs. Finally, we use a neighborhood-based reconstruction approach to reconstruct the HRTF from the estimated low-dimensional version. Simulations show that our approach performs better than PCA (Principal Component Analysis) and confirm that Isomap is capable of discovering the underlying nonlinear relationships of sound perception.
%@language en
%3 WTD sibgrapi 2015 Camera Ready.pdf